Chinese speech segmentation method based on Gauss distribution of time spans of syllables
ZHANG Yang, ZHAO Xiaoqun, WANG Digang
Journal of Computer Applications    2016, 36 (5): 1410-1414.   DOI: 10.11772/j.issn.1001-9081.2016.05.1410
So far, there is no accurate method for segmenting natural Chinese speech into syllables, although such a method would be valuable for labeling speech against a reference text automatically instead of by hand. Based on two hypotheses, that the time spans of Chinese syllables with the same pronunciation obey a Gaussian distribution and that a short-time energy valley exists between two adjacent syllables, a Chinese speech segmentation method based on the Gaussian distribution of syllable time spans was proposed. A simplified method based on the distribution of energy valleys was also given, which effectively reduced the time complexity of the segmentation method. The experimental results show that the segmentation accuracy (the mean square value of the time differences between manual labels and labels created by this method) reaches 10^-3, and that the computing time is less than 1 s in Matlab on a PC.
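The two hypotheses can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not detail: the function names, the frame sizes and the rule of picking the valley whose span is most likely under a Gaussian are all assumptions made for illustration.

```python
import math

def short_time_energy(samples, frame_len, hop):
    """Sum of squared samples in each frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]

def energy_valleys(energies):
    """Frames whose energy is lower than the previous frame's
    and no higher than the next frame's."""
    return [
        i for i in range(1, len(energies) - 1)
        if energies[i] < energies[i - 1] and energies[i] <= energies[i + 1]
    ]

def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def pick_boundary(valleys, start, mean, std):
    """Among valleys after `start`, pick the one whose span from `start`
    is most likely under the assumed Gaussian span model (in frames)."""
    candidates = [v for v in valleys if v > start]
    return max(candidates, key=lambda v: gaussian_log_pdf(v - start, mean, std))

# Synthetic signal (made up): three loud stretches separated by quiet gaps.
signal = [1.0] * 40 + [0.1] * 10 + [1.0] * 40 + [0.1] * 10 + [1.0] * 40
energies = short_time_energy(signal, frame_len=10, hop=10)
valleys = energy_valleys(energies)
first = pick_boundary(valleys, start=0, mean=5.0, std=1.0)
```

Restricting the search to energy valleys, rather than testing every frame as a candidate boundary, is one plausible reading of how the simplified method cuts the computation.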
Unvoiced/voiced mode codebook design algorithm based on cellular evenness
XU Jingyun, ZHAO Xiaoqun, CAI Zhiduan, WANG Peiliang
Journal of Computer Applications    2016, 36 (12): 3374-3377.   DOI: 10.11772/j.issn.1001-9081.2016.12.3374
The parameter distributions of unvoiced and voiced Line Spectrum Frequency (LSF) differ. To improve the quantization performance of LSF parameters in vocoders, an unvoiced/voiced mode codebook design algorithm based on Cell Evenness (CE) was presented, using both the difference between the unvoiced and voiced LSF parameter distributions and CE. Firstly, the optimal ratio of the numbers of unvoiced and voiced LSF parameters participating in codebook training was derived according to CE. Then, the specified number of atypical LSF parameters was eliminated from the unvoiced speech. Finally, the codebook was retrained. The experimental results show that, compared with the shared codebook algorithm at the same bit rate, the proposed algorithm reduced the average spectral distortion by 2.5%, increased the mean opinion score by 2.3% and reduced the codebook storage by 21.1%. The proposed algorithm is also suitable for vocoders without unvoiced/voiced symbol transmission.
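As a rough illustration of what a mode codebook means at quantization time, the Python sketch below selects a codebook by the unvoiced/voiced decision and then does a nearest-neighbour search. It does not implement the CE-based training described above; the codebook values, vector dimension and function names are invented for the example.

```python
def nearest_codeword(vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def quantize_lsf(vector, voiced, voiced_codebook, unvoiced_codebook):
    """Quantize an LSF vector with the codebook matching its mode."""
    codebook = voiced_codebook if voiced else unvoiced_codebook
    return nearest_codeword(vector, codebook)

# Toy 3-dimensional codebooks (hypothetical values, not from the paper).
VOICED_CB = [(0.2, 0.5, 0.9), (0.3, 0.8, 1.4)]
UNVOICED_CB = [(0.6, 1.2, 1.9), (0.9, 1.6, 2.4)]

voiced_index = quantize_lsf((0.28, 0.75, 1.3), True, VOICED_CB, UNVOICED_CB)
unvoiced_index = quantize_lsf((0.65, 1.25, 2.0), False, VOICED_CB, UNVOICED_CB)
```

Because each mode searches only its own codebook, the two codebooks can be sized and trained separately, which is where a training ratio between unvoiced and voiced data comes into play.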
Chinese speech segmentation into syllables based on energies in different times and frequencies
ZHANG Yang, ZHAO Xiaoqun, WANG Digang
Journal of Computer Applications    2016, 36 (11): 3222-3228.   DOI: 10.11772/j.issn.1001-9081.2016.11.3222
Precise speech segmentation methods help in comparing speech against voice models in speech recognition, and can also greatly improve the efficiency of corpus annotation. A new method for segmenting Chinese speech into syllables based on energy features in the time-frequency dimensions was proposed. Firstly, silence frames were found in the traditional way; secondly, unvoiced frames were found using the difference of energies in different frequencies; thirdly, voiced frames and speech frames were identified with the help of 0-1 energies in specific frequency ranges; finally, syllable positions were determined from the judgements above. The experimental results show that the proposed method, with a syllable error of 0.0297 s and a syllable deviation of 7.93%, is superior to the Merging-Based Syllable Detection Automaton (MBSDA) and the Gauss fitting method.
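The second step, separating unvoiced frames by comparing energies at different frequencies, can be sketched in Python as follows. This is only an assumed form of the test: the abstract does not give the bands or thresholds, so the split bin, the frame length and the helper names here are hypothetical, and a naive DFT stands in for whatever spectral analysis the authors used.

```python
import cmath
import math

def band_energy(frame, lo_bin, hi_bin):
    """Energy of DFT bins lo_bin..hi_bin-1, using a naive O(N^2) DFT."""
    n = len(frame)
    total = 0.0
    for k in range(lo_bin, hi_bin):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        total += abs(coeff) ** 2
    return total

def looks_unvoiced(frame, split_bin):
    """True when the band above split_bin carries more energy than the
    band below it (assumed decision rule, DC bin ignored)."""
    half = len(frame) // 2
    low = band_energy(frame, 1, split_bin)
    high = band_energy(frame, split_bin, half + 1)
    return high > low

def tone(bin_index, n=64):
    """Test signal: a sine wave centred exactly on one DFT bin."""
    return [math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]

low_tone_is_unvoiced = looks_unvoiced(tone(2), split_bin=16)
high_tone_is_unvoiced = looks_unvoiced(tone(25), split_bin=16)
```

Voiced speech concentrates its energy at low frequencies while unvoiced consonants are noise-like and spread towards higher frequencies, which is why a simple band comparison is a common starting point for this kind of decision.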